CLAIMS

1. A method of determining parameter combinations for automated access to World Wide
Web content that is accessible based on parameters resulting from real user interactions with a
World Wide Web site, said method comprising:

maintaining at least one log file containing at least one set of parameters resulting
from real user interactions with said World Wide Web site;

analyzing said log file to determine parameter combinations for automated access
to said World Wide Web content.

2. A method of determining parameter combinations for automated access to World Wide
Web content that is accessible based on parameters resulting from real user interactions with a
World Wide Web site, as per claim 1, wherein said parameters are entries in HTML forms, said
analyzing step further comprising

ranking entries in each set of entries according to their frequency of occurrence;

for each set of entries resulting from unlimited text entries, excluding entries
ranked below a predetermined number; and

wherein said parameter combinations are determined by producing combinations
of entries from each set of entries.
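
The ranking, exclusion, and combination steps recited above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `rank_entries`, `parameter_combinations`, `unlimited_fields`, and `top_n` are hypothetical, the logged entries are assumed to be already grouped by form field, and the Cartesian product is only one way of "producing combinations".

```python
from collections import Counter
from itertools import product


def rank_entries(values):
    """Return the distinct values ordered from most to least frequent."""
    return [value for value, _ in Counter(values).most_common()]


def parameter_combinations(logged, unlimited_fields, top_n):
    """Build parameter combinations from logged form entries.

    logged           -- dict mapping a form field name to the list of values
                        real users entered (assumed input shape)
    unlimited_fields -- names of fields that accept unlimited text
    top_n            -- the "predetermined number": entries ranked below it
                        are excluded for unlimited text fields
    """
    ranked = {}
    for field, values in logged.items():
        entries = rank_entries(values)
        if field in unlimited_fields:
            entries = entries[:top_n]  # drop low-ranked free-text entries
        ranked[field] = entries

    fields = list(ranked)
    # One entry from each set; here, every such selection (Cartesian product).
    return [dict(zip(fields, combo))
            for combo in product(*(ranked[f] for f in fields))]


logged = {
    "category": ["books", "music", "books"],
    "q": ["python", "java", "python", "rust", "java", "python"],
}
combos = parameter_combinations(logged, unlimited_fields={"q"}, top_n=2)
```

With the sample data, the free-text field `q` keeps only its two most frequent entries, so the rarely used entry `rust` never appears in a combination.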

3. A method of determining parameter combinations for automated access to World Wide
Web content that is accessible based on parameters resulting from real user interactions with a
World Wide Web site, as per claim 2, wherein said parameter combinations are determined by
producing all combinations of entries from each set of entries.

4. A method of determining parameter combinations for automated access to World Wide
Web content that is accessible based on parameters resulting from real user interactions with a
World Wide Web site, as per claim 2, wherein entries resulting from limited text entries and
unlimited text entries have stop words removed and remaining words stemmed.
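
As a rough illustration of stop word removal followed by stemming, consider the sketch below. The stop word list and the suffix-stripping rule are assumptions chosen for brevity; the claims do not name a particular stop list or stemming algorithm, and a production system would more likely use an established stemmer.

```python
# Illustrative stop word list (an assumption, not taken from the claims).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}

# Crude suffix stripping used as a stand-in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize_entry(text):
    """Lowercase a text entry, drop stop words, and stem what remains."""
    words = text.lower().split()
    return [stem(w) for w in words if w not in STOP_WORDS]


normalized = normalize_entry("The history of printing presses")
# normalized == ["history", "print", "press"]
```

Normalizing entries this way lets differently worded queries collapse onto the same terms before they are ranked by frequency.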



5. A method of determining parameter combinations for automated access to World Wide
Web content that is accessible based on parameters resulting from real user interactions with a
World Wide Web site, as per claim 1, wherein said log file is maintained by a proxy server that
logs communications between a client and a Web server resulting from real user accesses to said
World Wide Web content.
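
To make the proxy log concrete, the sketch below extracts form entries from logged requests. The log format (one `METHOD URL` pair per line), the function name `entries_from_log`, and the `example.com` URLs are all assumptions made for illustration; the claims do not specify a log layout, and form submissions sent by POST would need the request body to be logged as well.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def entries_from_log(lines, form_path):
    """Collect the values users submitted for each form field.

    lines     -- iterable of log lines in an assumed "METHOD URL" format
    form_path -- URL path of the form handler whose entries are wanted
    """
    entries = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 2 or parts[0] != "GET":
            continue  # only GET submissions carry parameters in the URL
        url = urlsplit(parts[1])
        if url.path != form_path:
            continue
        for name, value in parse_qsl(url.query):
            entries[name].append(value)
    return dict(entries)


log_lines = [
    "GET http://example.com/search?q=python&category=books",
    "GET http://example.com/about",
    "GET http://example.com/search?q=java+beans&category=books",
    "POST http://example.com/search",
]
entries = entries_from_log(log_lines, "/search")
```

The resulting dictionary holds one list of logged values per form field, which is a convenient shape for the frequency analysis that follows.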

6. A method of determining parameter combinations for automated access to World Wide
Web content that is accessible based on parameters resulting from real user interactions with a
World Wide Web site, as per claim 1, wherein said content is automatically accessed using said
parameter combinations.



7. A method of increasing web crawler penetration of Web databases accessible via HTML
forms, said method comprising:

reviewing previous real user queries;

identifying possible queries for said Web crawler from said previous real user
queries by synthesis of entries for any of: predefined sets, limited text entries or unlimited
text entries; and

providing said identified queries to said Web crawler during an instantiation of
automated access to said Web databases by said Web crawler.
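
One way to picture the final step, providing the identified queries to the crawler, is to turn each synthesized query into a URL the crawler can fetch. The sketch below assumes a form submitted by GET; the function name `query_urls` and the `example.com` address are hypothetical, and the crawler itself is outside the scope of the example.

```python
from urllib.parse import urlencode


def query_urls(action_url, combinations):
    """Yield one GET URL per synthesized query for a crawler to fetch.

    action_url   -- the form's target URL (assumed to accept GET requests)
    combinations -- iterable of dicts mapping field names to entries
    """
    for params in combinations:
        yield f"{action_url}?{urlencode(params)}"


urls = list(query_urls(
    "http://example.com/search",
    [{"q": "java beans", "category": "books"},
     {"q": "python", "category": "music"}],
))
```

Generating the URLs lazily lets a crawler pull queries as it goes rather than materializing every combination up front.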



8. A method of increasing web crawler penetration of Web databases accessible via HTML
forms, as per claim 7, wherein said previous user queries are maintained in a log file.

9. A method of increasing web crawler penetration of Web databases accessible via HTML
forms, as per claim 8, wherein said file is maintained by a proxy server.




10. A method of increasing web crawler penetration of Web databases accessible via HTML
forms, as per claim 7, wherein said synthesis comprises:

ranking any entries for predetermined sets;

ranking any entries for limited text entries;

ranking any entries for unlimited text entries;

excluding entries for unlimited text entries ranked below a predetermined number;
and

pairing entries from each set of ranked entries.

11. A method of increasing web crawler penetration of Web databases accessible via HTML
forms, as per claim 10, wherein said synthesis further comprises:

removing stop words and stemming remaining words for entries resulting from
limited text entries and unlimited text entries.



12. A method of determining entries for input items of an HTML form for automated
accesses to content contained in a Web database behind said HTML form, said method
comprising:

maintaining a log of real user entries for said input items;

analyzing said log to determine entry combinations for said input items.

13. A method of determining entries for input items of an HTML form for automated
accesses to content contained in a Web database behind said HTML form, as per claim 12,
wherein said log file contains at least one set of entries, said analyzing step further comprising

ranking entries in each set of entries according to their frequency of occurrence;

for each set of entries resulting from unlimited text entries, excluding entries
ranked below a predetermined number; and

wherein said automated parameter combinations are determined by producing
combinations of entries from each set of entries.

14. A method of determining entries for input items of an HTML form for automated
accesses to content contained in a Web database behind said HTML form, as per claim 13,
wherein said parameter combinations are determined by producing all combinations of entries
from each set of entries.

15. A method of determining entries for input items of an HTML form for automated
accesses to content contained in a Web database behind said HTML form, as per claim 13,
wherein entries resulting from limited text entries and unlimited text entries have stop words
removed and remaining words stemmed.

16. A method of determining entries for input items of an HTML form for automated
accesses to content contained in a Web database behind said HTML form, as per claim 12,
wherein said log file is maintained by a proxy server that logs communications between a client
and a Web server resulting from real user accesses to said World Wide Web content.



17. A method of emulating real user access to World Wide Web content dynamically
accessible via an HTML form, said method comprising:

maintaining a log containing real user entries into each input item of said HTML
form;

ranking entries for each input item according to their frequency of occurrence;

for each unlimited text entry input item, excluding entries ranked below a
predetermined number;

determining combinations of entries from each set of entries; and

automatically accessing said content using said combinations of entries.

18. A method of emulating real user access to World Wide Web content dynamically
accessible via an HTML form, as per claim 17, wherein entries resulting from limited text entries
and unlimited text entries have stop words removed and remaining words stemmed.

19. A method of emulating real user access to World Wide Web content dynamically
accessible via an HTML form, as per claim 17, wherein said log file is maintained by a proxy
server that logs communications between a client and a Web server resulting from real user
accesses to said World Wide Web content.

20. An article of manufacture comprising a computer usable medium having computer
readable program code embodied therein to determine parameter combinations for automated access
to World Wide Web content that is accessible based on parameters resulting from user
interactions with a World Wide Web site, said computer readable program code comprising:

computer readable program code for maintaining at least one log file
representative of real user interactions with said World Wide Web site;

computer readable program code for analyzing said log file to determine
parameter combinations for automated access to said World Wide Web content.



21. An article of manufacture comprising a computer usable medium having computer
readable program code embodied therein to determine parameter combinations for automated access
to World Wide Web content that is accessible based on parameters resulting from user
interactions with a World Wide Web site, as per claim 20, wherein said parameters are entries in
HTML forms, said computer readable program code for analyzing further comprising

computer readable program code for ranking entries in each set of entries
according to their frequency of occurrence; and

computer readable program code for each set of entries resulting from unlimited
text entries, excluding entries ranked below a predetermined number; and

wherein said parameter combinations are determined by producing combinations
of entries from each set of entries.

22. An article of manufacture comprising a computer usable medium having computer
readable program code embodied therein to determine parameter combinations for automated access
to World Wide Web content that is accessible based on parameters resulting from user
interactions with a World Wide Web site, as per claim 21, wherein said parameter combinations
are determined by producing all combinations of entries from each set of entries.

23. An article of manufacture comprising a computer usable medium having computer
readable program code embodied therein to determine parameter combinations for automated access
to World Wide Web content that is accessible based on parameters resulting from user
interactions with a World Wide Web site, as per claim 21, wherein entries resulting from limited
text entries and unlimited text entries have stop words removed and remaining words stemmed.



24. An article of manufacture comprising a computer usable medium having computer
readable program code embodied therein to determine parameter combinations for automated access
to World Wide Web content that is accessible based on parameters resulting from user
interactions with a World Wide Web site, as per claim 20, wherein said log file is maintained by
a proxy server that logs communications between a client and a Web server resulting from real
user accesses to said World Wide Web content.

25. An article of manufacture comprising a computer usable medium having computer
readable program code embodied therein to determine parameter combinations for automated access
to World Wide Web content that is accessible based on parameters resulting from user
interactions with a World Wide Web site, as per claim 20, wherein said content is automatically
accessed using said parameter combinations.






